Reduce number of zio free threads #534

Closed

nedbass commented Jan 13, 2012

As described in Issue #458, unlinking large amounts of data can cause
the threads in the zio free wait queue to start spinning. Reducing
the number of z_fr_iss threads from a fixed value of 100 to 1 per cpu
significantly reduces contention on the taskq spinlock and improves
throughput.

Instrumenting the taskq code showed that __taskq_dispatch() can spend
a long time holding tq->tq_lock if there are a large number of threads
in the queue. It turns out the time spent in wake_up() scales
linearly with the number of threads in the queue. When a large number
of short work items are dispatched, as seems to be the case with
unlink, the worker threads drain the queue faster than the dispatcher
can fill it. They then all pile into the work wait queue to wait for
new work items. So if 100 threads are in the queue, wake_up() takes
about 100 times as long, and the woken threads have to spin until the
dispatcher releases the lock.

Reducing the number of threads helps with the symptoms, but doesn't
get to the root of the problem. It would seem that wake_up()
shouldn't scale linearly in time with queue depth, particularly if we
are only trying to wake up one thread. In that vein, I tried making
all of the waiting processes exclusive to prevent the scheduler from
iterating over the entire list, but I still saw the linear time
scaling. So further investigation is needed, but in the meantime
reducing the thread count is an easy workaround.
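
The expectation behind the exclusive-waiter experiment can be illustrated with a small userspace model. This is only a sketch: `struct waiter`, `WAITER_EXCLUSIVE`, `model_wake_up()` and `model_wake_cost()` are hypothetical names, not the kernel or SPL API, and the number of queue entries visited merely stands in for time spent holding the lock. It assumes the wake-up walk visits every non-exclusive waiter but stops after waking the requested number of exclusive waiters. It does not explain why the linear scaling persisted in the measurements above.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified model of a wait queue entry; not the real kernel structure. */
#define WAITER_EXCLUSIVE 0x01

struct waiter {
	unsigned int flags;	/* WAITER_EXCLUSIVE or 0 */
	int awake;		/* set once the waiter has been woken */
};

/*
 * Walk the queue from the head and wake each waiter, stopping once
 * nr_exclusive exclusive waiters have been woken.  Returns the number
 * of entries visited, used here as a proxy for time under the lock.
 */
static size_t
model_wake_up(struct waiter *queue, size_t nwaiters, int nr_exclusive)
{
	size_t visited = 0;

	assert(nr_exclusive > 0);
	for (size_t i = 0; i < nwaiters; i++) {
		visited++;
		queue[i].awake = 1;
		if ((queue[i].flags & WAITER_EXCLUSIVE) &&
		    --nr_exclusive == 0)
			break;
	}
	return (visited);
}

/* Fill a queue with identical waiters and report the cost of one wake-up. */
static size_t
model_wake_cost(size_t nwaiters, unsigned int flags)
{
	struct waiter queue[128];

	assert(nwaiters <= 128);
	for (size_t i = 0; i < nwaiters; i++) {
		queue[i].flags = flags;
		queue[i].awake = 0;
	}
	return (model_wake_up(queue, nwaiters, 1));
}
```

In this model, 100 non-exclusive waiters cost 100 visits per wake-up, while 100 exclusive waiters cost a single visit. That gap is what the exclusive-waiter change was expected to deliver.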

As described in Issue openzfs#458 and openzfs#258, unlinking large amounts of data
can cause the threads in the zio free wait queue to start spinning.
Reducing the number of z_fr_iss threads from a fixed value of 100 to 1
per cpu significantly reduces contention on the taskq spinlock and
improves throughput.

@behlendorf

Merged as commit 08d08eb. We'll have to see whether we can improve things further down in the taskq implementation.

@behlendorf behlendorf closed this Jan 17, 2012
behlendorf added a commit to behlendorf/zfs that referenced this pull request May 21, 2018
This implementation of rw_tryupgrade() behaves slightly differently
from its counterparts on other platforms.  It drops the RW_READER lock
and then acquires the RW_WRITER lock leaving a small window where no
lock is held.  On other platforms the lock is never released during
the upgrade process.  This is necessary under Linux because the kernel
does not provide an upgrade function.

There are currently no callers in the ZFS code where this change in
behavior is a problem.  In fact, in most cases the code is already
written such that if the upgrade fails the RW_READER lock is dropped
and the caller blocks waiting to acquire the lock as RW_WRITER.

Signed-off-by: Brian Behlendorf <[email protected]>
Signed-off-by: Tim Chase <[email protected]>
Signed-off-by: Matthew Thode <[email protected]>
Closes openzfs#4388
Closes openzfs#534
pcd1193182 pushed a commit to pcd1193182/zfs that referenced this pull request Sep 26, 2023
Mismerge introduced by openzfs#534 DLPX-82101